GNU Info File | 2001-07-15 | 47.8 KB | 1,148 lines
This is Info file f/g77.info, produced by Makeinfo version 1.68 from
the input file ./f/g77.texi.
INFO-DIR-SECTION Programming
START-INFO-DIR-ENTRY
* g77: (g77). The GNU Fortran compiler.
END-INFO-DIR-ENTRY
This file documents the use and the internals of the GNU Fortran
(`g77') compiler. It corresponds to the GCC-2.95 version of `g77'.
Published by the Free Software Foundation 59 Temple Place - Suite 330
Boston, MA 02111-1307 USA
Copyright (C) 1995-1999 Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided also
that the sections entitled "GNU General Public License," "Funding for
Free Software," and "Protect Your Freedom--Fight `Look And Feel'" are
included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.
Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that the sections entitled "GNU General Public
License," "Funding for Free Software," and "Protect Your Freedom--Fight
`Look And Feel'", and this permission notice, may be included in
translations approved by the Free Software Foundation instead of in the
original English.
Contributed by James Craig Burley (<craig@jcb-sc.com>). Inspired by
a first pass at translating `g77-0.5.16/f/DOC' that was contributed to
Craig by David Ronis (<ronis@onsager.chem.mcgill.ca>).
File: g77.info, Node: Adding Options, Next: Projects, Prev: Service, Up: Top
Adding Options
**************
To add a new command-line option to `g77', first decide what kind of
option you wish to add. Search the `g77' and `gcc' documentation for
one or more options that are most closely like the one you want to add
(in terms of what kind of effect they have, and so on) to help clarify
its nature.
* *Fortran options* are options that apply only when compiling
Fortran programs. They are accepted by both `g77' and `gcc', but
they affect only Fortran compilation.
* *Compiler options* are options that apply when compiling almost any
kind of program.
*Fortran options* are listed in the file `egcs/gcc/f/lang-options.h',
which is used during the build of `gcc' to build a list of all options
that are accepted by at least one language's compiler. This list goes
into the `lang_options' array in `gcc/toplev.c', which uses this array
to determine whether a particular option should be offered to the
linked-in front end for processing by calling `lang_decode_option',
which, for `g77', is in `egcs/gcc/f/com.c' and just calls
`ffe_decode_option'.
If the linked-in front end "rejects" a particular option passed to
it, `toplev.c' just ignores the option, because *some* language's
compiler is willing to accept it.
This allows commands like `gcc -fno-asm foo.c bar.f' to work, even
though Fortran compilation does not currently support the `-fno-asm'
option; even though the `f771' version of `lang_decode_option' rejects
`-fno-asm', `toplev.c' doesn't produce a diagnostic because some other
language (C) does accept it.
This also means that commands like `g77 -fno-asm foo.f' yield no
diagnostics, despite the fact that no phase of the command was able to
recognize and process `-fno-asm'--perhaps a warning about this would be
helpful if it were possible.
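The dispatch described above can be sketched roughly as follows. This
is a toy model, not the code in `toplev.c': the table contents and the
names `sketch_lang_options', `sketch_f771_decode', and
`sketch_handle_option' are assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the merged `lang_options' list: every
   option that at least one language front end accepts.  */
const char *const sketch_lang_options[] = {
  "-fno-asm",        /* accepted by the C front end */
  "-fugly-comma",    /* accepted by the Fortran front end */
  NULL
};

/* Hypothetical stand-in for the front end linked into `f771'.
   Returns 1 if the option was recognized, 0 if it was "rejected".  */
int
sketch_f771_decode (const char *opt)
{
  return strcmp (opt, "-fugly-comma") == 0;
}

enum sketch_result { SKETCH_PROCESSED, SKETCH_IGNORED, SKETCH_NOT_LANG };

/* Offer OPT to the front end only if some language lists it.  A listed
   option that the front end rejects is dropped without a diagnostic.  */
enum sketch_result
sketch_handle_option (const char *opt)
{
  size_t i;

  for (i = 0; sketch_lang_options[i] != NULL; i++)
    if (strcmp (opt, sketch_lang_options[i]) == 0)
      return sketch_f771_decode (opt) ? SKETCH_PROCESSED : SKETCH_IGNORED;

  /* Not a language option: left to the compiler-option tables.  */
  return SKETCH_NOT_LANG;
}
```

In this model, `-fno-asm' given to the Fortran compiler takes the
`SKETCH_IGNORED' path, which is why no diagnostic appears.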
Code that processes Fortran options is found in `egcs/gcc/f/top.c',
function `ffe_decode_option'. This code needs to check positive and
negative forms of each option.
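One way to check both forms of an option is sketched below. The flag
variable, the option name `-fsketch-feature', and the function name are
all hypothetical; the real `ffe_decode_option' may be organized quite
differently.

```c
#include <string.h>

/* Hypothetical option flag; a real flag would live with the other
   global option definitions.  */
int sketch_feature = 0;

/* Recognize `-fsketch-feature' and `-fno-sketch-feature'.
   Returns 1 if OPT was one of the two forms, 0 otherwise.  */
int
sketch_decode_option (const char *opt)
{
  int value = 1;

  if (strncmp (opt, "-f", 2) != 0)
    return 0;
  opt += 2;

  /* A leading `no-' selects the negative form.  */
  if (strncmp (opt, "no-", 3) == 0)
    {
      value = 0;
      opt += 3;
    }

  if (strcmp (opt, "sketch-feature") == 0)
    {
      sketch_feature = value;
      return 1;
    }
  return 0;
}
```

Stripping the `no-' prefix first keeps a single comparison per option
name, so the positive and negative forms cannot drift apart.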
The defaults for Fortran options are set in their global
definitions, also found in `egcs/gcc/f/top.c'. Many of these defaults
are actually macros defined in `egcs/gcc/f/target.h', since they might
be machine-specific. However, since, in practice, GNU compilers should
behave the same way on all configurations (especially when it comes to
language constructs), the practice of setting defaults in `target.h' is
likely to be deprecated and, ultimately, stopped in future versions of
`g77'.
Accessor macros for Fortran options, used by code in the `g77' FFE,
are defined in `egcs/gcc/f/top.h'.
*Compiler options* are listed in `gcc/toplev.c' in the array
`f_options'. An option not listed in `lang_options' is looked up in
`f_options' and handled from there.
The defaults for compiler options are set in the global definitions
for the corresponding variables, some of which are in `gcc/toplev.c'.
You can set different defaults for *Fortran-oriented* or
*Fortran-reticent* compiler options by changing the source code of
`g77' and rebuilding. How to do this depends on the version of `g77':
`G77 0.5.24 (EGCS 1.1)'
`G77 0.5.25 (EGCS 1.2)'
Change the `lang_init_options' routine in `egcs/gcc/f/com.c'.
(Note that these versions of `g77' perform internal consistency
checking automatically when the `-fversion' option is specified.)
`G77 0.5.23'
`G77 0.5.24 (EGCS 1.0)'
Change the way `f771' handles the `-fset-g77-defaults' option,
which is always provided as the first option when called by `g77'
or `gcc'.
This code is in `ffe_decode_option' in `egcs/gcc/f/top.c'. Have
it change just the variables that you want to default to a
different setting for Fortran compiles compared to compiles of
other languages.
The `-fset-g77-defaults' option is passed to `f771' automatically
because of the specification information kept in
`egcs/gcc/f/lang-specs.h'. This file tells the `gcc' command how
to recognize, in this case, Fortran source files (those to be
preprocessed, and those that are not), and further, how to invoke
the appropriate programs (including `f771') to process those
source files.
It is in `egcs/gcc/f/lang-specs.h' that `-fset-g77-defaults',
`-fversion', and other options are passed, as appropriate, even
when the user has not explicitly specified them. Other "internal"
options such as `-quiet' also are passed via this mechanism.
File: g77.info, Node: Projects, Next: Front End, Prev: Adding Options, Up: Top
Projects
********
If you want to contribute to `g77' by doing research, design,
specification, documentation, coding, or testing, the following
information should give you some ideas. More relevant information
might be available from `ftp://alpha.gnu.org/gnu/g77/projects/'.
* Menu:
* Efficiency:: Make `g77' itself compile code faster.
* Better Optimization:: Teach `g77' to generate faster code.
* Simplify Porting:: Make `g77' easier to configure, build,
and install.
* More Extensions:: Features many users won't know to ask for.
* Machine Model:: `g77' should better leverage `gcc'.
* Internals Documentation:: Make maintenance easier.
* Internals Improvements:: Make internals more robust.
* Better Diagnostics:: Make using `g77' on new code easier.
File: g77.info, Node: Efficiency, Next: Better Optimization, Up: Projects
Improve Efficiency
==================
Don't bother doing any performance analysis until most of the
following items are taken care of, because there's no question they
represent serious space/time problems, although some of them show up
only given certain kinds of (popular) input.
* Improve `malloc' package and its uses to specify more info about
memory pools and, where feasible, use obstacks to implement them.
* Skip over uninitialized portions of aggregate areas (arrays,
`COMMON' areas, `EQUIVALENCE' areas) so zeros need not be output.
This would reduce memory usage for large initialized aggregate
areas, even ones with only one initialized element.
As of version 0.5.18, a portion of this item has already been
accomplished.
* Prescan the statement (in `sta.c') so that the nature of the
statement is determined as much as possible by looking entirely at
its form, and not looking at any context (previous statements,
including types of symbols). This would allow ripping out of the
statement-confirmation, symbol retraction/confirmation, and
diagnostic inhibition mechanisms. Plus, it would result in
much-improved diagnostics. For example, `CALL
some-intrinsic(...)', where the intrinsic is not a subroutine
intrinsic, would result in an actual error instead of the
unimplemented-statement catch-all.
* Throughout `g77', don't pass line/column pairs where a simple
`ffewhere' type, which points to the error as much as is desired
by the configuration, will do, and don't pass `ffelexToken' types
where a simple `ffewhere' type will do. Then, allow new default
configuration of `ffewhere' such that the source line text is not
preserved, and leave it to things like Emacs' next-error function
to point to them (now that `next-error' supports column, or,
perhaps, character-offset, numbers). The change in calling
sequences should improve performance somewhat, as should not
having to save source lines. (Whether this whole item will
improve performance is questionable, but it should improve
maintainability.)
* Handle `DATA (A(I),I=1,1000000)/1000000*2/' more efficiently,
especially as regards the assembly output. Some of this might
require improving the back end, but lots of improvement in
space/time required in `g77' itself can be fairly easily obtained
without touching the back end. Maybe type-conversion, where
necessary, can be speeded up as well in cases like the one shown
(converting the `2' into `2.').
* If analysis shows it to be worthwhile, optimize `lex.c'.
* Consider redesigning `lex.c' to not need any feedback during
tokenization, by keeping track of enough parse state on its own.
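To illustrate the `DATA' item above, initial values can be kept as
(repeat-count, value) runs rather than expanded element by element.
The following is a minimal sketch under that assumption; the structure
and function names are hypothetical and are not taken from the `g77'
sources.

```c
#include <stddef.h>

/* One run of identical initial values, as in `1000000*2'.  */
struct sketch_run
{
  unsigned long count;   /* repeat count */
  double value;          /* value after any type conversion */
};

/* Return the value of element INDEX (zero-based) described by NRUNS
   runs, or FALLBACK if no run covers that element.  Conversion of a
   value happens once per run, not once per element.  */
double
sketch_element (const struct sketch_run *runs, size_t nruns,
                unsigned long index, double fallback)
{
  size_t i;

  for (i = 0; i < nruns; i++)
    {
      if (index < runs[i].count)
        return runs[i].value;
      index -= runs[i].count;
    }
  return fallback;
}

/* Total number of elements covered, computed without expanding runs.  */
unsigned long
sketch_total (const struct sketch_run *runs, size_t nruns)
{
  unsigned long total = 0;
  size_t i;

  for (i = 0; i < nruns; i++)
    total += runs[i].count;
  return total;
}
```

With this representation the million-element example costs one run
record, and the conversion of `2' into `2.' is done a single time.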
File: g77.info, Node: Better Optimization, Next: Simplify Porting, Prev: Efficiency, Up: Projects
Better Optimization
===================
Much of this work should be put off until after `g77' has all the
features necessary for its widespread acceptance as a useful F77
compiler. However, perhaps this work can be done in parallel during
the feature-adding work.
* Do the equivalent of the trick of putting `extern inline' in front
of every function definition in `libg2c' and #include'ing the
resulting file in `f2c'+`gcc'--that is, inline all
run-time-library functions that are at all worth inlining. (Some
of this has already been done, such as for integral
exponentiation.)
* When doing `CHAR_VAR = CHAR_FUNC(...)', and it's clear that types
line up and `CHAR_VAR' is addressable or not a `VAR_DECL', make
`CHAR_VAR', not a temporary, be the receiver for `CHAR_FUNC'.
(This is now done for `COMPLEX' variables.)
* Design and implement Fortran-specific optimizations that don't
really belong in the back end, or where the front end needs to
give the back end more info than it currently does.
* Design and implement a new run-time library interface, with the
code going into `libgcc' so no special linking is required to link
Fortran programs using standard language features. This library
would speed up lots of things, from I/O (using precompiled formats,
doing just one, or, at most, very few, calls for arrays or array
sections, and so on) to general computing (array/section
implementations of various intrinsics, implementation of commonly
performed loops that aren't likely to be optimally compiled
otherwise, etc.).
Among the important things the library would do are:
* Be a one-stop-shop-type library, hence shareable and usable
by all, in that what are now library-build-time options in
`libg2c' would be moved at least to the `g77' compile phase,
if not to finer grains (such as choosing how list-directed
I/O formatting is done by default at `OPEN' time, for
preconnected units via options or even statements in the main
program unit, maybe even on a per-I/O basis with appropriate
pragma-like devices).
* Probably requiring the new library design, change interface to
normally have `COMPLEX' functions return their values in the way
`gcc' would if they were declared `__complex__ float', rather than
using the mechanism currently used by `CHARACTER' functions
(whereby the functions are compiled as returning void and their
first arg is a pointer to where to store the result). (Don't
append underscores to external names for `COMPLEX' functions in
some cases once `g77' uses `gcc' rather than `f2c' calling
conventions.)
* Do something useful with `doiter' references where possible. For
example, `CALL FOO(I)' cannot modify `I' if within a `DO' loop
that uses `I' as the iteration variable, and the back end might
find that info useful in determining whether it needs to read `I'
back into a register after the call. (It normally has to do that,
unless it knows `FOO' never modifies its passed-by-reference
argument, which is rarely the case for Fortran-77 code.)
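The two return conventions contrasted in the `COMPLEX' item above can
be sketched in C as follows. The function names and the structure
layout are assumptions for illustration, and the C99 spelling `float
complex' is used here in place of the GNU spelling `__complex__ float'.

```c
#include <complex.h>

/* Layout assumed here for a COMPLEX value passed by reference.  */
typedef struct { float r, i; } sketch_complex;

/* Pointer-result convention, like the one the text describes for
   `CHARACTER' functions: the function returns void and stores its
   result through its first argument.  */
void
sketch_add_ptr (sketch_complex *result,
                const sketch_complex *a, const sketch_complex *b)
{
  result->r = a->r + b->r;
  result->i = a->i + b->i;
}

/* Value-return convention: the result comes back the way the compiler
   returns any complex float, with no hidden result argument.  */
float complex
sketch_add_val (float complex a, float complex b)
{
  return a + b;
}
```

The second form leaves the choice of registers or memory for the
result to the back end, which is the point of the proposed change.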
File: g77.info, Node: Simplify Porting, Next: More Extensions, Prev: Better Optimization, Up: Projects
Simplify Porting
================
Making `g77' easier to configure, port, build, and install, either
as a single-system compiler or as a cross-compiler, would be very
useful.
* A new library (replacing `libg2c') should improve portability as
well as produce more optimal code. Further, `g77' and the new
library should conspire to simplify naming of externals, such as
by removing unnecessarily added underscores, and to
reduce/eliminate the possibility of naming conflicts, while making
debugging more straightforward.
Also, it should make multi-language applications more feasible,
such as by providing Fortran intrinsics that get Fortran unit
numbers given C `FILE *' descriptors.
* Possibly related to a new library, `g77' should produce the
equivalent of a `gcc' `main(argc, argv)' function when it compiles
a main program unit, instead of compiling something that must be
called by a library implementation of `main()'.
This would do many useful things such as provide more flexibility
in terms of setting up exception handling, not requiring
programmers to start their debugging sessions with `breakpoint
MAIN__' followed by `run', and so on.
* The GBE needs to understand the difference between alignment
requirements and desires. For example, on Intel x86 machines,
`g77' currently imposes overly strict alignment requirements, due
to the back end, but it would be useful for Fortran and C
programmers to be able to override these *recommendations* as long
as they don't violate the actual processor *requirements*.
File: g77.info, Node: More Extensions, Next: Machine Model, Prev: Simplify Porting, Up: Projects
More Extensions
===============
These extensions are not the sort of things users ask for "by name",
but they might improve the usability of `g77', and Fortran in general,
in the long run. Some of these items really pertain to improving `g77'
internals so that some popular extensions can be more easily supported.
* Look through all the documentation on the GNU Fortran language,
dialects, compiler, missing features, bugs, and so on. Many
mentions of incomplete or missing features are sprinkled
throughout. It is not worth repeating them here.
* Consider adding a `NUMERIC' type to designate typeless numeric
constants, named and unnamed. The idea is to provide a
forward-looking, effective replacement for things like the
old-style `PARAMETER' statement when people really need
typelessness in a maintainable, portable, clearly documented way.
Maybe `TYPELESS' would include `CHARACTER', `POINTER', and
whatever else might come along. (This is not really a call for
polymorphism per se, just an ability to express limited, syntactic
polymorphism.)
* Support `OPEN(...,KEY=(...),...)'.
* Support arbitrary file unit numbers, instead of limiting them to 0
through `MXUNIT-1'. (This is a `libg2c' issue.)
* `OPEN(NOSPANBLOCKS,...)' is treated as
`OPEN(UNIT=NOSPANBLOCKS,...)', so a later `UNIT=' in the first
example is invalid. Make sure this is what users of this feature
would expect.
* Currently `g77' disallows `READ(1'10)' since it is an obnoxious
syntax, but supporting it might be pretty easy if needed. More
details are needed, such as whether general expressions separated
by an apostrophe are supported, or maybe the record number can be
a general expression, and so on.
* Support `STRUCTURE', `UNION', `MAP', and `RECORD' fully.
Currently there is no support at all for `%FILL' in `STRUCTURE'
and related syntax, whereas the rest of the stuff has at least
some parsing support. This requires either major changes to
`libg2c' or its replacement.
* F90 and `g77' probably disagree about label scoping relative to
`INTERFACE' and `END INTERFACE', and their contained procedure
interface bodies (blocks?).
* `ENTRY' doesn't support F90 `RESULT()' yet, since that was added
after S8.112.
* Empty-statement handling (10 ;;CONTINUE;;) probably isn't
consistent with the final form of the standard (it was vague at
S8.112).
* It seems to be an "open" question whether a file, immediately
after being `OPEN'ed, is positioned at the beginning, the end, or
wherever--it might be nice to offer an option of opening to
"undefined" status, requiring an explicit absolute-positioning
operation to be performed before any other (besides `CLOSE') to
assist in making applications port to systems (some IBM?) that
`OPEN' to the end of a file or some such thing.
File: g77.info, Node: Machine Model, Next: Internals Documentation, Prev: More Extensions, Up: Projects
Machine Model
=============
These items pertain to generalizing `g77''s view of the machine model
to more fully accept whatever the GBE provides it via its configuration.
* Switch to using `REAL_VALUE_TYPE' to represent floating-point
constants exclusively so the target float format need not be
required. This means changing the way `g77' handles
initialization of aggregate areas having more than one type, such
as `REAL' and `INTEGER', because currently it initializes them as
if they were arrays of `char' and uses the bit patterns of the
constants of the various types in them to determine what to stuff
in elements of the arrays.
* Rely more and more on back-end info and capabilities, especially
in the area of constants (where having the `g77' front-end's IL
just store the appropriate tree nodes containing constants might
be best).
* Suite of C and Fortran programs that a user/administrator can run
on a machine to help determine the configuration for `g77' before
building and help determine if the compiler works (especially with
whatever libraries are installed) after building.
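The first item above describes initializing a mixed-type area as if it
were an array of `char'. A minimal sketch of that idea follows. It
assumes a host whose `float' and `int' formats match the target's,
which is exactly the dependency the item wants to remove. All names
are hypothetical.

```c
#include <string.h>

/* Size of a hypothetical area holding a `REAL' followed by an
   `INTEGER', treated as raw bytes.  */
#define SKETCH_AREA_SIZE (sizeof (float) + sizeof (int))

/* Fill AREA with the host bit patterns of R and N.  This produces
   correct target data only when host and target formats agree.  */
void
sketch_init_area (unsigned char area[SKETCH_AREA_SIZE], float r, int n)
{
  memcpy (area, &r, sizeof r);
  memcpy (area + sizeof r, &n, sizeof n);
}

/* Read the two values back out of the byte image.  */
void
sketch_read_area (const unsigned char area[SKETCH_AREA_SIZE],
                  float *r, int *n)
{
  memcpy (r, area, sizeof *r);
  memcpy (n, area + sizeof *r, sizeof *n);
}
```

A cross-compiler cannot rely on this, since the bytes written are
those of the host format, not necessarily those of the target.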
File: g77.info, Node: Internals Documentation, Next: Internals Improvements, Prev: Machine Model, Up: Projects
Internals Documentation
=======================
Better info on how `g77' works and how to port it is needed. Much
of this should be done only after the redesign planned for 0.6 is
complete.
*Note Front End::, which contains some information on `g77'
internals.
File: g77.info, Node: Internals Improvements, Next: Better Diagnostics, Prev: Internals Documentation, Up: Projects
Internals Improvements
======================
Some more items that would make `g77' more reliable and easier to
maintain:
* Generally make expression handling focus more on critical syntax
stuff, leaving semantics to callers. For example, anything a
caller can check, semantically, let it do so, rather than having
`expr.c' do it. (Exceptions might include things like diagnosing
`FOO(I--K:)=BAR' where `FOO' is a `PARAMETER'--if it seems
important to preserve the left-to-right-in-source order of
production of diagnostics.)
* Come up with better naming conventions for `-D' to establish
requirements to achieve desired implementation dialect via
`proj.h'.
* Clean up used tokens and `ffewhere's in `ffeglobal_terminate_1'.
* Replace `sta.c' `outpooldisp' mechanism with `malloc_pool_use'.
* Check for `opANY' in more places in `com.c', `std.c', and `ste.c',
and get rid of the `opCONVERT(opANY)' kludge (after determining if
there is indeed no real need for it).
* Utility to read and check `bad.def' messages and their references
in the code, to make sure calls are consistent with message
templates.
* Search and fix `&ffe...' and similar so that `ffe...ptr...' macros
are available instead (a good argument for wishing all this stuff
could have been written in C++, perhaps). On the other hand, it's
questionable whether this sort of improvement is really necessary,
given the availability of tools such as Emacs and Perl, which make
finding any address-taking of structure members easy enough.
* Some modules truly export the member names of their structures
(and the structures themselves), maybe fix this, and fix other
modules that just appear to as well (by appending `_', though it'd
be ugly and probably not worth the time).
* Implement C macros `RETURNS(value)' and `SETS(something,value)' in
`proj.h' and use them throughout `g77' source code (especially in
the definitions of access macros in `.h' files) so they can be
tailored to catch code writing into a `RETURNS()' or reading from
a `SETS()'.
* Decorate throughout with `const' and other such stuff.
* All F90 notational derivations in the source code are still based
on the S8.112 version of the draft standard. Probably should
update to the official standard, or put documentation of the rules
as used in the code...uh...in the code.
* Some `ffebld_new' calls (those outside of `ffeexpr.c' or inside
but invoked via paths not involving `ffeexpr_lhs' or
`ffeexpr_rhs') might be creating things in improper pools, leading
to such things staying around too long or (doubtful, but possible
and dangerous) not long enough.
* Some `ffebld_list_new' (or whatever) calls might not be matched by
`ffebld_list_bottom' (or whatever) calls, which might someday
matter. (It definitely is not a problem just yet.)
* Probably not doing clean things when we fail to `EQUIVALENCE'
something due to alignment/mismatch or other problems--they end up
without `ffestorag' objects, so maybe the backend (and other parts
of the front end) can notice that and handle like an `opANY' (do
what it wants, just don't complain or crash). Most of this seems
to have been addressed by now, but a code review wouldn't hurt.
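As an illustration of the `RETURNS' and `SETS' item above, here is one
possible tailoring of such macros. These definitions, the structure,
and the accessor names are hypothetical; nothing like them is confirmed
to exist in `proj.h'.

```c
/* The result of a comma expression is not an lvalue in C, so an
   attempt to assign to RETURNS (...) is rejected by the compiler.  */
#define RETURNS(value) ((void) 0, (value))

/* The cast to void means the result of SETS (...) cannot be read.  */
#define SETS(something, value) ((void) ((something) = (value)))

/* Hypothetical structure with access macros built on the above.  */
struct sketch_symbol
{
  int kind;
};

#define sketch_symbol_kind(s)        RETURNS ((s)->kind)
#define sketch_symbol_set_kind(s, k) SETS ((s)->kind, (k))
```

With these definitions, `sketch_symbol_kind (&sym) = 1;' and `x =
sketch_symbol_set_kind (&sym, 1);' both fail to compile, while ordinary
reads and writes through the macros behave as before.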
File: g77.info, Node: Better Diagnostics, Prev: Internals Improvements, Up: Projects
Better Diagnostics
==================
These are things users might not ask about, or that need to be
looked into before they are worth worrying about. Also here are items
that involve reducing unnecessary diagnostic clutter.
* When `FUNCTION' and `ENTRY' point types disagree (`CHARACTER'
lengths, type classes, and so on), `ANY'-ize the offending `ENTRY'
point and any *new* dummies it specifies.
* Speed up and improve error handling for data when repeat-count is
specified. For example, don't output 20 unnecessary messages
after the first necessary one for:
      INTEGER X(20)
      CONTINUE
      DATA (X(I), J= 1, 20) /20*5/
      END
(The `CONTINUE' statement ensures the `DATA' statement is
processed in the context of executable, not specification,
statements.)
File: g77.info, Node: Front End, Next: Diagnostics, Prev: Projects, Up: Top
Front End
*********
This chapter describes some aspects of the design and implementation
of the `g77' front end. Much of the information below applies not to
current releases of `g77', but to the 0.6 rewrite being designed and
implemented as of late May, 1999.
To find out about things that are "To Be Determined" or "To Be Done",
search for the string TBD. If you want to help by working on one or
more of these items, email me at <craig@jcb-sc.com>. If you're
planning to do more than just research issues and offer comments, see
`http://www.gnu.org/software/contribute.html' for steps you might need
to take first.
* Menu:
* Overview of Sources::
* Overview of Translation Process::
* Philosophy of Code Generation::
* Two-pass Design::
* Challenges Posed::
* Transforming Statements::
* Transforming Expressions::
* Internal Naming Conventions::
File: g77.info, Node: Overview of Sources, Next: Overview of Translation Process, Up: Front End
Overview of Sources
===================
The current directory layout includes the following, where SRCDIR
stands for the top of the source tree:
`SRCDIR/gcc/'
Non-g77 files in gcc
`SRCDIR/gcc/f/'
GNU Fortran front end sources
`SRCDIR/libf2c/'
`libg2c' configuration and `g2c.h' file generation
`SRCDIR/libf2c/libF77/'
General support and math portion of `libg2c'
`SRCDIR/libf2c/libI77/'
I/O portion of `libg2c'
`SRCDIR/libf2c/libU77/'
Additional interfaces to Unix `libc' for `libg2c'
Components of note in `g77' are described below.
`f/' as a whole contains the source for `g77', while `libf2c/'
contains a portion of the separate program `f2c'. Note that the
`libf2c' code is not part of the program `g77', just distributed with
it.
`f/' contains text files that document the Fortran compiler, source
files for the GNU Fortran Front End (FFE), and some other stuff. The
`g77' compiler code is placed in `f/' because it, along with its
contents, is designed to be a subdirectory of a `gcc' source directory,
`gcc/', which is structured so that language-specific front ends can be
"dropped in" as subdirectories. The C++ front end (`g++'), is an
example of this--it resides in the `cp/' subdirectory. Note that the C
front end (also referred to as `gcc') is an exception to this, as its
source files reside in the `gcc/' directory itself.
`libf2c/' contains the run-time libraries for the `f2c' program,
also used by `g77'. These libraries are normally referred to collectively
as `libf2c'. When built as part of `g77', `libf2c' is installed under
the name `libg2c' to avoid conflict with any existing version of
`libf2c', and thus is often referred to as `libg2c' when the `g77'
version is specifically being referred to.
The `netlib' version of `libf2c/' contains two distinct libraries,
`libF77' and `libI77', each in their own subdirectories. In `g77',
this distinction is not made, beyond maintaining the subdirectory
structure in the source-code tree.
`libf2c/' is not part of the program `g77', just distributed with it.
It contains files not present in the official (`netlib') version of
`libf2c', and also contains some minor changes made from `libf2c', to
fix some bugs, and to facilitate automatic configuration, building, and
installation of `libf2c' (as `libg2c') for use by `g77' users. See
`libf2c/README' for more information, including licensing conditions
governing distribution of programs containing code from `libg2c'.
`libg2c', `g77''s version of `libf2c', adds Dave Love's
implementation of `libU77', in the `libf2c/libU77/' directory. This
library is distributed under the GNU Library General Public License
(LGPL)--see the file `libf2c/libU77/COPYING.LIB' for more information,
as this license governs distribution conditions for programs containing
code from this portion of the library.
Files of note in `f/' and `libf2c/' are described below:
`f/BUGS'
Lists some important bugs known to be in g77. Or use Info (or GNU
Emacs Info mode) to read the "Actual Bugs" node of the `g77'
documentation:
info -f f/g77.info -n "Actual Bugs"
`f/ChangeLog'
Lists recent changes to `g77' internals.
`libf2c/ChangeLog'
Lists recent changes to `libg2c' internals.
`f/NEWS'
Contains the per-release changes. These include the user-visible
changes described in the node "Changes" in the `g77'
documentation, plus internal changes of import. Or use:
info -f f/g77.info -n News
`f/g77.info*'
The `g77' documentation, in Info format, produced by building
`g77'.
All users of `g77' (not just installers) should read this, using
the `more' command if neither the `info' command nor GNU Emacs
(with its Info mode) is available, or if users aren't yet
accustomed to using these tools. All of these files are readable
as "plain text" files, though they're easier to navigate using
Info readers such as `info' and GNU Emacs Info mode.
If you want to explore the FFE code, which lives entirely in `f/',
here are a few clues. The file `g77spec.c' contains the `g77'-specific
source code for the `g77' command only--this just forms a variant of the
`gcc' command, so, just as the `gcc' command itself does not contain
the C front end, the `g77' command does not contain the Fortran front
end (FFE). The FFE code ends up in an executable named `f771', which
does the actual compiling, so it contains the FFE plus the `gcc' back
end (GBE), the latter to do most of the optimization, and the code
generation.
The file `parse.c' is the source file for `yyparse()', which is
invoked by the GBE to start the compilation process, for `f771'.
The file `top.c' contains the top-level FFE function `ffe_file' and
it (along with `top.h') defines all `ffe_[a-z].*', `ffe[A-Z].*', and
`FFE_[A-Za-z].*' symbols.
The file `fini.c' is a `main()' program that is used when building
the FFE to generate C header and source files for recognizing keywords.
The files `malloc.c' and `malloc.h' comprise a memory manager that
defines all `malloc_[a-z].*', `malloc[A-Z].*', and `MALLOC_[A-Za-z].*'
symbols.
All other modules named XYZ are comprised of all files named
`XYZ*.EXT' and define all `ffeXYZ_[a-z].*', `ffeXYZ[A-Z].*', and
`FFEXYZ_[A-Za-z].*' symbols. If you understand all this,
congratulations--it's easier for me to remember how it works than to
type in these regular expressions. But it does make it easy to find
where a symbol is defined. For example, the symbol
`ffexyz_set_something' would be defined in `xyz.h' and implemented
there (if it's a macro) or in `xyz.c'.
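The naming rules above amount to a small lookup procedure, sketched
here. The function name is hypothetical, and the sketch covers only
`ffe'/`FFE' symbols, not the `malloc' family.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Derive the module name from SYMBOL per the conventions above and
   store it, lowercased, in MODULE (capacity SIZE).  Returns 1 on
   success, 0 if SYMBOL does not start with `ffe' or `FFE' or the
   name does not fit.  */
int
sketch_module_of (const char *symbol, char *module, size_t size)
{
  int upper;
  size_t n = 0;
  const char *p;

  if (strncmp (symbol, "ffe", 3) == 0)
    upper = 0;
  else if (strncmp (symbol, "FFE", 3) == 0)
    upper = 1;
  else
    return 0;

  /* The module name runs up to the first `_' or change of case.  */
  for (p = symbol + 3; *p != '\0' && *p != '_'; p++)
    {
      unsigned char c = (unsigned char) *p;
      int in_name = upper ? isupper (c) : islower (c);

      if (!in_name)
        break;
      if (n + 1 >= size)
        return 0;
      module[n++] = (char) tolower (c);
    }

  /* An empty name means the top-level module, `top.c' and `top.h'.  */
  if (n == 0)
    {
      if (size < 4)
        return 0;
      strcpy (module, "top");
      return 1;
    }
  module[n] = '\0';
  return 1;
}
```

For example, `ffexyz_set_something' maps to `xyz', so the symbol would
be looked for in `xyz.h' and `xyz.c'.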
The "porting" files of note currently are:
`proj.c'
`proj.h'
This defines the "language" used by all the other source files,
the language being Standard C plus some useful things like
`ARRAY_SIZE' and such.
`target.c'
`target.h'
These describe the target machine in terms of what data types are
supported, how they are denoted (to what C type does an
`INTEGER*8' map, for example), how to convert between them, and so
on. Over time, versions of `g77' rely less on this file and more
on run-time configuration based on GBE info in `com.c'.
`com.c'
`com.h'
These are the primary interface to the GBE.
`ste.c'
`ste.h'
This contains code for implementing recognized executable
statements in the GBE.
`src.c'
`src.h'
These contain information on the format(s) of source files (such
as whether they are never to be processed as case-insensitive with
regard to Fortran keywords).
If you want to debug the `f771' executable, for example if it
crashes, note that the global variables `lineno' and `input_filename'
are usually set to reflect the current line being read by the lexer
during the first-pass analysis of a program unit and to reflect the
current line being processed during the second-pass compilation of a
program unit.
If an invocation of the function `ffestd_exec_end' is on the stack,
the compiler is in the second pass, otherwise it is in the first.
(This information might help you reduce a test case and/or work
around a bug in `g77' until a fix is available.)
File: g77.info, Node: Overview of Translation Process, Next: Philosophy of Code Generation, Prev: Overview of Sources, Up: Front End
Overview of Translation Process
===============================
The order of phases translating source code to the form accepted by
the GBE is:
1. Stripping punched-card sources (`g77stripcard.c')
2. Lexing (`lex.c')
3. Stand-alone statement identification (`sta.c')
4. Parsing (`stb.c' and `expr.c')
5. Constructing (`stc.c')
6. Collecting (`std.c')
7. Expanding (`ste.c')
To get a rough idea of how a particularly twisted Fortran statement
gets treated by the passes, consider:
      FORMAT(I2 4H)=(J/
     &   I3)
The job of `lex.c' is to know enough about Fortran syntax rules to
break the statement up into distinct lexemes without requiring any
feedback from subsequent phases:
`FORMAT'
`('
`I24H'
`)'
`='
`('
`J'
`/'
`I3'
`)'
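As a rough illustration only (this is not how `lex.c' is written), the
following Python sketch reproduces the lexeme list above for
fixed-form input. The name `lex_fixed_form' is hypothetical. The
sketch assumes there are no comment lines and no `CHARACTER'
constants, and it simply discards blanks, whereas the real lexer also
has to record whether lexemes were contiguous.

```python
def lex_fixed_form(lines):
    """Split one fixed-form statement (initial line plus its
    continuation lines) into a list of lexemes."""
    # Columns 1-6 hold the label and continuation fields; the
    # statement text starts in column 7 (string index 6).
    text = "".join(line[6:] for line in lines)
    # Blanks are not significant here, so `I2 4H' becomes `I24H'.
    text = text.replace(" ", "")
    lexemes = []
    i = 0
    while i < len(text):
        if text[i].isalnum():
            # Gather a maximal run of letters and digits.
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            lexemes.append(text[i:j])
            i = j
        else:
            # Every other character is a lexeme on its own.
            lexemes.append(text[i])
            i += 1
    return lexemes


statement = ["      FORMAT(I2 4H)=(J/",
             "     & I3)"]
print(lex_fixed_form(statement))
```

Note that nothing in the sketch consults later phases: the lexemes
come out the same whether the statement turns out to be a `FORMAT'
statement or an assignment.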
The job of `sta.c' is to figure out the kind of statement, or, at
least, statement form, that sequence of lexemes represents.
The sooner it can do this (in terms of using the smallest number of
lexemes, starting with the first for each statement), the better,
because that leaves diagnostics for problems beyond the recognition of
the statement form to subsequent phases, which can usually better
describe the nature of the problem.
In this case, the `=' at "level zero" (not nested within parentheses)
tells `sta.c' that this is an *assignment-form*, not `FORMAT',
statement.
An assignment-form statement might be a statement-function
definition or an executable assignment statement.
To make that determination, `sta.c' looks at the first two lexemes.
Since the second lexeme is `(', the first must represent an array
for this to be an assignment statement, else it's a statement function.
Either way, `sta.c' hands off the statement to `stb.c' (either its
statement-function parser or its assignment-statement parser).
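The decision just described can be sketched in Python as follows.
This is a simplified model, not the code in `sta.c': the function
names and the `array_names' parameter are hypothetical, and the sketch
ignores other statement forms that also contain a level-zero `=' (for
example, `DO').

```python
def has_level_zero_equals(lexemes):
    """Return True if an `=' appears outside all parentheses."""
    depth = 0
    for lexeme in lexemes:
        if lexeme == "(":
            depth += 1
        elif lexeme == ")":
            depth -= 1
        elif lexeme == "=" and depth == 0:
            return True
    return False


def classify(lexemes, array_names=()):
    """Classify a statement using only the level-zero `=' test and
    the first two lexemes."""
    if not has_level_zero_equals(lexemes):
        return "not assignment-form"
    if len(lexemes) > 1 and lexemes[1] == "(":
        # NAME( ... ) = ...: an array element if NAME is a known
        # array, else a statement-function definition.
        if lexemes[0] in array_names:
            return "assignment"
        return "statement function"
    return "assignment"


lexemes = ["FORMAT", "(", "I24H", ")", "=", "(", "J", "/", "I3", ")"]
print(classify(lexemes))
print(classify(lexemes, array_names={"FORMAT"}))
```

The second call models a program unit in which `FORMAT' was
previously declared as an array.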
`stb.c' forms a statement-specific record containing the pertinent
information. That information includes a source expression and, for an
assignment statement, a destination expression. Expressions are parsed
by `expr.c'.
This record is passed to `stc.c', which copes with the implications
of the statement within the context established by previous statements.
For example, if it's the first statement in the file or after an
`END' statement, `stc.c' recognizes that, first of all, a main program
unit is now being lexed (and tells that to `std.c' before telling it
about the current statement).
`stc.c' attaches whatever information it can, usually derived from
the context established by the preceding statements, and passes the
information to `std.c'.
`std.c' saves this information away, since the GBE cannot cope with
information that might be incomplete at this stage.
For example, `I3' might later be determined to be an argument to an
alternate `ENTRY' point.
When `std.c' is told about the end of an external (top-level)
program unit, it passes all the information it has saved away on
statements in that program unit to `ste.c'.
`ste.c' "expands" each statement, in sequence, by constructing the
appropriate GBE information and calling the appropriate GBE routines.
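The division of labor between `std.c' and `ste.c' amounts to
buffering statements until the program unit is complete. A minimal
Python sketch of that idea follows. The `Collector' class and its
method names are hypothetical, and the `expand' callback merely stands
in for the role played by `ste.c'.

```python
class Collector:
    """Hypothetical stand-in for the role `std.c' plays."""

    def __init__(self, expand):
        self.expand = expand   # callback playing the role of `ste.c'
        self.pending = []      # statements saved for this program unit

    def statement(self, info):
        # Information may still be incomplete, so just save it.
        self.pending.append(info)

    def end_of_program_unit(self):
        # Everything is now known; expand each statement in sequence.
        for info in self.pending:
            self.expand(info)
        self.pending = []


expanded = []
collector = Collector(expanded.append)
collector.statement("first statement")
collector.statement("END")
print(expanded)                  # nothing has been expanded yet
collector.end_of_program_unit()
print(expanded)
```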
Details on the transformational phases follow. Keep in mind that
Fortran numbering is used, so the first character on a line is column 1,
decimal numbering is used, and so on.
* Menu:
* g77stripcard::
* lex.c::
* sta.c::
* stb.c::
* expr.c::
* stc.c::
* std.c::
* ste.c::
* Gotchas (Transforming)::
* TBD (Transforming)::
File: g77.info, Node: g77stripcard, Next: lex.c, Up: Overview of Translation Process
g77stripcard
------------
The `g77stripcard' program handles removing content beyond column 72
(adjustable via a command-line option), optionally warning about that
content being something other than trailing whitespace or Fortran
commentary.
This program is needed because `lex.c' doesn't pay attention to
maximum line lengths at all, to make it easier to maintain, as well as
faster (for sources that don't depend on the maximum column length
vis-a-vis trailing non-blank non-commentary content).
Just how this program will be run--whether automatically for old
source (perhaps as the default for `.f' files?)--is not yet determined.
In the meantime, it might as well be implemented as a typical UNIX
pipe.
It should accept a `-fline-length-N' option, with the default line
length set to 72.
When the text it strips off the end of a line is not blank (not
spaces and tabs), it should insert an additional comment line
(beginning with `!', so it works for both fixed-form and free-form
files) containing the text, following the stripped line. The inserted
comment should have a prefix of some kind, TBD, that distinguishes the
comment as representing stripped text. Users could use that to `sed'
out such lines, if they wished--it seems silly to provide a
command-line option to delete information when it can be so easily
filtered out by another program.
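The intended behavior might look something like the following Python
sketch. It is only an illustration: the function name `strip_card'
is hypothetical, and the `tag' argument is a placeholder for the
comment prefix, which remains TBD. The sketch also does not try to
recognize Fortran commentary in the stripped text.

```python
def strip_card(lines, line_length=72, tag="stripped: "):
    """Truncate each line to `line_length' columns.  When the removed
    text is not blank, follow the line with a `!' comment holding it.
    Lines are given without their terminating newlines."""
    out = []
    for line in lines:
        kept, extra = line[:line_length], line[line_length:]
        out.append(kept)
        # Only spaces and tabs count as blank.
        if extra.strip(" \t"):
            out.append("!" + tag + extra)
    return out


card = "      X = 1".ljust(72) + "00000010"
for line in strip_card([card]):
    print(line)
```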
(This inserted comment should be designed to "fit in" well with
whatever the Fortran community is using these days for preprocessor,
translator, and other such products, like OpenMP. What that's all
about, and how `g77' can elegantly fit its special comment conventions
into it all, is TBD as well. We don't want to reinvent the wheel here,
but if there turn out to be too many conflicting conventions, we might
have to invent one that looks nothing like the others, but which offers
their host products a better infrastructure in which to fit and coexist
peacefully.)
`g77stripcard' probably shouldn't do any tab expansion or other
fancy stuff. People can use `expand' or other pre-filtering if they
like. The idea here is to keep each stage quite simple, while providing
excellent performance for "normal" code.
(Code with junk beyond column 72 is not really "normal", as it comes
from a card-punch heritage, and will be increasingly hard for
tomorrow's Fortran programmers to read.)
File: g77.info, Node: lex.c, Next: sta.c, Prev: g77stripcard, Up: Overview of Translation Process
lex.c
-----
To help make the lexer simple, fast, and easy to maintain, while
also having `g77' generally encourage Fortran programmers to write
simple, maintainable, portable code by maximizing the performance of
compiling that kind of code:
* There'll be just one lexer, for both fixed-form and free-form
source.
* It'll care about the form only when handling the first 7 columns of
text, stuff like spaces between strings of alphanumerics, and how
lines are continued.
Some other distinctions will be handled by subsequent phases, so
at least one of them will have to know which form is involved.
For example, `I = 2 . 4' is acceptable in fixed form, and works in
free form as well given the implementation `g77' presently uses.
But the standard requires a diagnostic for it in free form, so the
parser has to be able to recognize that the lexemes aren't
contiguous (information the lexer *does* have to provide) and that
free-form source is being parsed, so it can provide the diagnostic.
The `g77' lexer doesn't try to gather `2 . 4' into a single lexeme.
Otherwise, it'd have to know a whole lot more about how to parse
Fortran, or subsequent phases (mainly parsing) would have two
paths through lots of critical code--one to handle the lexemes `2',
`.', and `4' in sequence, another to handle the lexeme `2.4'.
* It won't worry about line lengths (beyond the first 7 columns for
fixed-form source).
That is, once it starts parsing the "statement" part of a line
(column 7 for fixed-form, column 1 for free-form), it'll keep
going until it finds a newline, rather than ignoring everything
past a particular column (72 or 132).
The implication here is that there shouldn't *be* anything past
that last column, other than whitespace or commentary, because
users using typical editors (or viewing output as typically
printed) won't necessarily know just where the last column is.
Code that has "garbage" beyond the last column (almost certainly
only fixed-form code with a punched-card legacy, such as code
using columns 73-80 for "sequence numbers") will have to be run
through `g77stripcard' first.
Also, keeping track of the maximum column position while also
watching out for the end of a line *and* while reading from a file
just makes things slower. Since a file must be read, and watching
for the end of the line is necessary (unless the typical input
file was preprocessed to include the necessary number of trailing
spaces), dropping the tracking of the maximum column position is
the only way to reduce the complexity of the pertinent code while
maintaining high performance.
* ASCII encoding is assumed for the input file.
Code written in other character sets will have to be converted
first.
* Tabs (ASCII code 9) will be converted to spaces via the
straightforward approach.
Specifically, a tab is converted to between one and eight spaces
as necessary to reach column N, where dividing `(N - 1)' by eight
results in a remainder of zero.
* Linefeeds (ASCII code 10) mark the ends of lines.
* A carriage return (ASCII code 13) is accepted if it immediately
precedes a linefeed, in which case it is ignored.
Otherwise, it is rejected (with a diagnostic).
* Any characters other than the above that are not part of the
GNU Fortran Character Set (*note Character Set::.) are rejected
with a diagnostic.
This includes backspaces, form feeds, and the like.
(It might make sense to allow a form feed in column 1 as long as
that's the only character on a line. It certainly wouldn't seem
to cost much in terms of performance.)
* The end of the input stream (EOF) ends the current line.
* The distinction between uppercase and lowercase letters will be
preserved.
It will be up to subsequent phases to decide to fold case.
Current plans are to permit any casing for Fortran (reserved)
keywords while preserving casing for user-defined names. (This
might not be made the default for `.f' files, though.)
Preserving case seems necessary to provide more direct access to
facilities outside of `g77', such as to C or Pascal code.
Names of intrinsics will probably be matchable in any case.
However, there probably won't be any option to require a
particular mixed-case appearance of intrinsics (as there was for
`g77' prior to version 0.6), because that's painful to maintain,
and probably nobody uses it.
(How `external SiN; r = sin(x)' would be handled is TBD. I think
old `g77' might already handle that pretty elegantly, but whether
we can cope with allowing the same fragment to reference a
*different* procedure, even with the same interface, via `s =
SiN(r)', needs to be determined. If it can't, we need to make
sure that when code introduces a user-defined name, any intrinsic
matching that name using a case-insensitive comparison is "turned
off".)
* Backslashes in `CHARACTER' and Hollerith constants are not allowed.
This avoids the confusion introduced by some Fortran compiler
vendors providing C-like interpretation of backslashes, while
others provide straight-through interpretation.
Some kind of lexical construct (TBD) will be provided to allow
flagging of a `CHARACTER' (but probably not a Hollerith) constant
that permits backslashes. It'll necessarily be a prefix, such as:
PRINT *, C'This line has a backspace \b here.'
PRINT *, F'This line has a straight backslash \ here.'
Further, command-line options might be provided to specify that
one prefix or the other is to be assumed as the default for
`CHARACTER' constants.
However, it seems more helpful for `g77' to provide a program that
prefixes all constants (or just those containing backslashes) with
the desired designation, so printouts of code can be read without
knowing the compile-time options used when compiling it.
If such a program is provided (let's name it `g77slash' for now),
then a command-line option to `g77' should not be provided.
(Though, given that it'll be easy to implement, it might be hard
to resist user requests for it "to compile faster than if we have
to invoke another filter".)
This program would take a command-line option to specify the
default interpretation of backslashes, affecting which prefix it
uses for constants.
`g77slash' probably should automatically convert Hollerith
constants that contain backslashes to the appropriate `CHARACTER'
constants. Then `g77' wouldn't have to define a prefix syntax for
Hollerith constants specifying whether they want C-style or
straight-through backslashes.
The above implements nearly exactly what is specified by *Note
Character Set::., and *Note Lines::., except it also provides automatic
conversion of tabs and ignoring of newline-related carriage returns.
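The line-level rules (linefeeds, carriage returns, tabs, and end of
file) can be sketched in Python as follows. The function names are
hypothetical, and the sketch does not check for characters outside the
GNU Fortran Character Set.

```python
def normalize_line(raw):
    """Apply the carriage-return and tab rules to one line, given
    without its terminating linefeed."""
    # A carriage return immediately before the linefeed is ignored.
    if raw.endswith("\r"):
        raw = raw[:-1]
    # Any other carriage return is rejected.
    if "\r" in raw:
        raise ValueError("carriage return not followed by linefeed")
    out = []
    for ch in raw:
        if ch == "\t":
            # Emit one to eight spaces, stopping when the next column
            # N satisfies (N - 1) % 8 == 0; here N - 1 == len(out).
            out.append(" ")
            while len(out) % 8 != 0:
                out.append(" ")
        else:
            out.append(ch)
    return "".join(out)


def split_lines(data):
    """Split input at linefeeds; end of input ends the current line."""
    lines = data.split("\n")
    if lines and lines[-1] == "":
        lines.pop()   # the final linefeed already ended the last line
    return [normalize_line(line) for line in lines]


print(split_lines("\tI = 1\r\n\tEND"))
```

For lines that contain no carriage returns, the tab handling here
gives the same result as Python's `str.expandtabs(8)'.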
It also effects the "pure visual" model, by which is meant that a
user viewing his code in a typical text editor (assuming it's not
preprocessed via `g77stripcard' or similar) doesn't need any special
knowledge of whether spaces on the screen are really tabs, whether
lines end immediately after the last visible non-space character or
after a number of spaces and tabs that follow it, or whether the last
line in the file is ended by a newline.
Most editors don't make these distinctions, the ANSI FORTRAN 77
standard doesn't require them to, and it permits a standard-conforming
compiler to define a method for transforming source code to "standard
form" however it wants.
So, GNU Fortran defines it such that users have the best chance of
having the code be interpreted the way it looks on the screen of the
typical editor.
(Fancy editors should *never* be required to correctly read code
written in classic two-dimensional-plaintext form. By correct reading
I mean ability to read it, book-like, without mistaking text ignored by
the compiler for program code and vice versa, and without having to
count beyond the first several columns. The vague meaning of ASCII
TAB, among other things, complicates this somewhat, but as long as
"everyone", including the editor, other tools, and printer, agrees
about the every-eighth-column convention, the GNU Fortran "pure visual"
model meets these requirements. Any language or user-visible source
form requiring special tagging of tabs, the ends of lines after
spaces/tabs, and so on, is broken by this definition. Fortunately,
Fortran *itself* is not broken, even if most vendor-supplied defaults
for their Fortran compilers *are* in this regard.)
Further, this model provides a clean interface to whatever
preprocessors or code-generators are used to produce input to this
phase of `g77'. Mainly, they need not worry about long lines.
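Returning to the `I = 2 . 4' example earlier in this node, the
following Python sketch shows one way a lexer could report contiguity
so that a later phase can issue the free-form diagnostic. Both
function names are hypothetical, and the check handles only the
simple digits, `.', digits pattern.

```python
import re


def lex_with_gaps(text):
    """Return (lexeme, preceded_by_blank) pairs for one statement."""
    pairs = []
    for match in re.finditer(r"( *)([A-Za-z0-9]+|[^ ])", text):
        pairs.append((match.group(2), match.group(1) != ""))
    return pairs


def split_real_constants(pairs, free_form):
    """Return the indices at which a digits `.' digits sequence is
    broken up by blanks.  Only free form calls for a diagnostic."""
    if not free_form:
        return []
    hits = []
    for k in range(len(pairs) - 2):
        (a, _), (dot, gap1), (b, gap2) = pairs[k:k + 3]
        if a.isdigit() and dot == "." and b.isdigit() and (gap1 or gap2):
            hits.append(k)
    return hits


pairs = lex_with_gaps("I = 2 . 4")
print(pairs)
print(split_real_constants(pairs, free_form=True))
print(split_real_constants(pairs, free_form=False))
```

The lexer itself stays ignorant of the source form in this sketch;
only the later check takes `free_form' into account.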
File: g77.info, Node: sta.c, Next: stb.c, Prev: lex.c, Up: Overview of Translation Process
sta.c
-----
File: g77.info, Node: stb.c, Next: expr.c, Prev: sta.c, Up: Overview of Translation Process
stb.c
-----
File: g77.info, Node: expr.c, Next: stc.c, Prev: stb.c, Up: Overview of Translation Process
expr.c
------
File: g77.info, Node: stc.c, Next: std.c, Prev: expr.c, Up: Overview of Translation Process
stc.c
-----
File: g77.info, Node: std.c, Next: ste.c, Prev: stc.c, Up: Overview of Translation Process
std.c
-----
File: g77.info, Node: ste.c, Next: Gotchas (Transforming), Prev: std.c, Up: Overview of Translation Process
ste.c
-----